Resource Wrapper Tutorial

Interaction with MLDB occurs via a REST API. Working with a REST API over HTTP from a Notebook interface can be laborious if you use a general-purpose Python library like requests directly, so MLDB comes with a Python library called pymldb to ease the pain.

pymldb does this in three ways:

  • the Python Resource class: this is a simple class which wraps the requests library so as to make HTTP calls to the MLDB API more friendly in a Notebook environment. This tutorial shows you how to use it.
  • the %mldb magics: these are Jupyter line- and cell-magic commands which allow you to make raw HTTP calls to MLDB, and which also provide some higher-level functions. Check out the Cell magic Tutorial for more info on the %mldb magic system.
  • the Python BatFrame class: this is a class that behaves like the Pandas DataFrame but offloads computation to the server via HTTP calls. Check out the BatFrame Tutorial for more info on the BatFrame.

Getting started

A Resource object is just an extremely cheap-to-create, immutable proxy for a single URL.


In [16]:
from pymldb.resource import Resource
r = Resource("http://localhost")
print r


http://localhost

You can use a Resource object to quickly create new Resource objects that refer to different URLs, either by accessing an attribute or by calling the object with a path segment as its argument. Both forms can be chained:


In [17]:
print type(r), r
x = r.x
print type(x), x
y = x("y")
print type(y), y
z = r("and").so("on")("and").so.on
print type(z), z


<class 'mldb.resource.Resource'> http://localhost
<class 'mldb.resource.Resource'> http://localhost/x
<class 'mldb.resource.Resource'> http://localhost/x/y
<class 'mldb.resource.Resource'> http://localhost/and/so/on/and/so/on
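The attribute-and-call chaining shown above can be reproduced in a few lines of plain Python. The following is only a minimal sketch of the idea, not pymldb's actual implementation: the class name UrlProxy is hypothetical, and the sketch is written for Python 3. It relies on __getattr__ (invoked only when normal attribute lookup fails) and __call__ to return a new proxy each time, which is what keeps every object immutable and cheap to create.

```python
class UrlProxy(object):
    """Hypothetical immutable proxy for a single URL (not pymldb's code)."""

    __slots__ = ("_url",)

    def __init__(self, url):
        # Go through object.__setattr__ because our own __setattr__ refuses writes.
        object.__setattr__(self, "_url", url.rstrip("/"))

    def __setattr__(self, name, value):
        raise AttributeError("UrlProxy is immutable")

    def __getattr__(self, name):
        # Reached only for names that are not real attributes, so r.x means <url>/x.
        if name.startswith("_"):
            raise AttributeError(name)
        return UrlProxy(self._url + "/" + name)

    def __call__(self, segment):
        # r("x") is the call form, useful for segments that are not valid identifiers.
        return UrlProxy(self._url + "/" + str(segment))

    def __str__(self):
        return self._url


r = UrlProxy("http://localhost")
z = r("and").so("on")("and").so.on
print(z)  # http://localhost/and/so/on/and/so/on
```

Because each step returns a fresh object, intermediate proxies such as r.v1 can be stored and reused without any risk of one chain affecting another.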

Making HTTP requests

Once you have a Resource object that refers to a URL you care about, you can use it to issue HTTP requests:


In [18]:
dataset_types = r.v1.types.datasets
dataset_types.get()


Out[18]:
GET http://localhost/v1/types/datasets
200 OK
[
  "beh", 
  "beh.binary", 
  "beh.live", 
  "beh.mutable", 
  "beh.ranged", 
  "embedding", 
  "merged", 
  "sqliteSparse", 
  "transposed"
]

The HTTP request is performed via the Python requests library: arguments to get(), post(), put() and delete() are just delegated to the corresponding requests function. The only thing that Resource does to the result is patch it so it will display prettily in a Notebook, as above.
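To make the delegation concrete, here is a small sketch of the pattern, under stated assumptions. RecordingBackend and DelegatingProxy are hypothetical names invented for this illustration; the backend stands in for the requests module and records calls instead of sending them, so the example runs without a server. The keyword arguments used, timeout and headers, are ones that requests.get() really accepts.

```python
class RecordingBackend(object):
    """Hypothetical stand-in for the requests module: records calls, sends nothing."""

    def __init__(self):
        self.calls = []

    def get(self, url, **kwargs):
        self.calls.append(("GET", url, kwargs))
        return kwargs


class DelegatingProxy(object):
    """Hypothetical wrapper that forwards get() keyword arguments unchanged."""

    def __init__(self, url, backend):
        self.url = url
        self.backend = backend

    def get(self, **kwargs):
        # The proxy supplies the URL; everything else is passed straight through.
        return self.backend.get(self.url, **kwargs)


backend = RecordingBackend()
proxy = DelegatingProxy("http://localhost/v1/types/datasets", backend)
proxy.get(timeout=5, headers={"Accept": "application/json"})
print(backend.calls)
```

If Resource delegates as described, the same keyword arguments should be usable on a real Resource, for example dataset_types.get(timeout=5).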

Convenience methods

Resource objects provide three convenience methods for interacting with MLDB: get_query(), put_json() and post_json():


In [19]:
# keyword arguments to get_query() are appended to the GET query string
r.v1.types.get_query(x="y")


Out[19]:
GET http://localhost/v1/types?x=y
200 OK
[
  "algorithms", 
  "functions", 
  "datasets", 
  "procedures", 
  "plugins"
]
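The way keyword arguments end up in the query string can be sketched with the standard library alone. This is an illustration of the general technique, not the code get_query() actually runs: the helper name with_query is hypothetical, the sketch targets Python 3, and the parameters are sorted only to make the output deterministic.

```python
from urllib.parse import urlencode


def with_query(url, **params):
    """Hypothetical helper: append keyword arguments to url as a query string."""
    if not params:
        return url
    # urlencode percent-escapes values and joins the pairs with "&".
    return url + "?" + urlencode(sorted(params.items()))


print(with_query("http://localhost/v1/types", x="y"))
# http://localhost/v1/types?x=y
```

Values that contain spaces or other reserved characters are escaped by urlencode, so they do not need to be quoted by hand.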

In [20]:
sample_dataset = r.v1.datasets("sample")
# delete any existing dataset with this id so that the PUT below creates a fresh one
sample_dataset.delete()
# dictionary arguments to put_json() and post_json() are sent as JSON via PUT or POST
sample_dataset.put_json({"type": "beh.mutable"})


Out[20]:
PUT http://localhost/v1/datasets/sample
201 Created
{
  "status": {
    "rowCount": 0, 
    "valueCount": 0
  }, 
  "config": {
    "type": "beh.mutable", 
    "id": "sample"
  }, 
  "state": "ok", 
  "type": "beh.mutable", 
  "id": "sample"
}
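For readers curious what "sent as JSON via PUT" amounts to on the wire, the sketch below builds an equivalent request with the Python 3 standard library and inspects it without sending anything. The helper name build_json_request is hypothetical, and setting a Content-Type header of application/json is an assumption about what a JSON convenience method would typically do, not a statement about pymldb's internals.

```python
import json
from urllib.request import Request


def build_json_request(method, url, payload):
    """Hypothetical helper: serialize payload as JSON and attach it to a request."""
    body = json.dumps(payload).encode("utf-8")
    headers = {"Content-Type": "application/json"}  # assumed, see the note above
    return Request(url, data=body, headers=headers, method=method)


req = build_json_request(
    "PUT", "http://localhost/v1/datasets/sample", {"type": "beh.mutable"}
)
print(req.get_method(), req.full_url)
print(req.data)
```

The request object is only constructed here; passing it to urllib.request.urlopen() would actually send it to a running MLDB instance.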

Putting it all together

Now that you've seen the basics, check out the Predicting Titanic Survival demo to see how to use the Resource class to do machine learning with MLDB.

